Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks

نویسندگان

  • Jean Louis Raisaro
  • Florian Tramèr
  • Zhanglong Ji
  • Diyue Bu
  • Yongan Zhao
  • W. Knox Carey
  • David D. Lloyd
  • Heidi Sofia
  • Dixie Baker
  • Paul Flicek
  • Suyash S. Shringarpure
  • Carlos D. Bustamante
  • Shuang Wang
  • Xiaoqian Jiang
  • Lucila Ohno-Machado
  • Haixu Tang
  • XiaoFeng Wang
  • Jean-Pierre Hubaux
چکیده

The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context-a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or "beacon") is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertains and in accordance with the GA4GH policy and standards.While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including relatively small sample size and an adversary who possesses an individual's whole genome sequence), the individual's membership in a beacon can be inferred through repeated queries for variants present in the individual's genome.In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon such that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Privacy Risks from Genomic Data-Sharing Beacons.

The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research. Beacons are web servers that answer allele-presence queries--such as "Do you have a genome that has a specific nucleotide (e.g., A) at a specific genomic position (e.g., position 11,272 on chromosome 1)?"--with either "yes" or "no." Here, we show that individuals...

متن کامل

Game-Based Cryptanalysis of a Lightweight CRC-Based Authentication Protocol for EPC Tags

The term "Internet of Things (IoT)" expresses a huge network of smart and connected objects which can interact with other devices without our interposition. Radio frequency identification (RFID) is a great technology and an interesting candidate to provide communications for IoT networks, but numerous security and privacy issues need to be considered. In this paper, we analyze the security and ...

متن کامل

Quantification of De-anonymization Risks in Social Networks

The risks of publishing privacy-sensitive data have received considerable attention recently. Several deanonymization attacks have been proposed to re-identify individuals even if data anonymization techniques were applied. However, there is no theoretical quantification for relating the data utility that is preserved by the anonymization techniques and the data vulnerability against de-anonymi...

متن کامل

Enhancing privacy of recent authentication schemes for low-cost RFID systems

Nowadays Radio Frequency Identification (RFID) systems have appeared in lots of identification and authentication applications. In some sensitive applications, providing secure and confidential communication is very important for end-users. To this aim, different RFID authentication protocols have been proposed, which have tried to provide security and privacy of RFID users. In this paper, we a...

متن کامل

Why Pseudonyms Don’t Anonymize: A Computational Re-identification Analysis of Genomic Data Privacy Protection Systems

Objectives: In order to supply patient-derived genomic data for medical research purposes, care must be taken to protect the identities of the patients from unwanted intrusions. Recently, several protection techniques that employ the use of trusted third parties (TTPs), and other identity protecting schemas, have been proposed and deployed. The goal of this paper is to analyze the susceptibilit...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 24  شماره 

صفحات  -

تاریخ انتشار 2017